AITopics | mean squared error

Appendix

Neural Information Processing Systems, Apr-25-2026, 03:48:18 GMT

In this appendix, we first introduce the datasets and evaluation metrics used in the experiments in Section A. Then, we provide extra experimental results in Section B. In Section C, we present details of network design, training scheme, and hyper-parameter tuning. We conduct experiments on 11 popular time series datasets: (1) Electricity Transformer Temperature [42] (ETTh1, ETTh2, ETTm1) consists of two years of electric power data collected from two separate counties of China. Each data point includes an "oil temperature" value and 6 power load features. The data is aggregated into 5-minute windows, resulting in 12 points per hour and 288 points per day. A.1 Electricity Transformer Temperature (ETT): For data pre-processing, we perform zero-mean normalization, i.e., X … We use Mean Absolute Errors (MAE) [17] and Mean Squared Errors (MSE) [26] for model comparison.
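
The pre-processing and metrics named in this excerpt can be sketched as follows. This is a minimal illustration, not the paper's code: the excerpt's normalization formula is cut off, so the sketch assumes the common z-score form (subtract the mean, divide by the standard deviation). The function names `zscore`, `mae`, and `mse` are hypothetical.

```python
from statistics import fmean, pstdev


def zscore(series):
    """Assumed z-score normalization; returns the values plus the stats needed to invert it."""
    mu = fmean(series)
    sigma = pstdev(series) or 1.0  # fall back to 1.0 for a constant series
    return [(x - mu) / sigma for x in series], mu, sigma


def mae(y_true, y_pred):
    """Mean Absolute Error: average of |target - prediction|."""
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def mse(y_true, y_pred):
    """Mean Squared Error: average of (target - prediction)^2."""
    return fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))
```

MSE penalizes large errors more heavily than MAE because each error is squared before averaging, which is one reason the two are often reported together.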

artificial intelligence, dataset, machine learning, (19 more...)

Country: North America > United States > California (0.29)

Industry:

Energy > Power Industry (1.00)
Energy > Renewable > Solar (0.33)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Adaptive Normalization Mamba with Multi Scale Trend Decomposition and Patch MoE Encoding

Jeon, MinCheol

arXiv.org Artificial Intelligence, Dec-9-2025

Time series forecasting in real-world environments faces significant challenges: non-stationarity, multi-scale temporal patterns, and distributional shifts that degrade model stability and accuracy. This study proposes AdaMamba, a unified forecasting architecture that integrates adaptive normalization, multi-scale trend extraction, and contextual sequence modeling to address these challenges. AdaMamba begins with an Adaptive Normalization Block that removes non-stationary components through multi-scale convolutional trend extraction and channel-wise recalibration, enabling consistent detrending and variance stabilization. The normalized sequence is then processed by a Context Encoder that combines patch-wise embeddings, positional encoding, and a Mamba-enhanced Transformer layer with a mixture-of-experts feed-forward module, allowing efficient modeling of both long-range dependencies and local temporal dynamics. A lightweight prediction head generates multi-horizon forecasts, and a denormalization mechanism reconstructs outputs by reintegrating local trends to ensure robustness under varying temporal conditions. AdaMamba provides strong representational capacity with modular extensibility, supporting deterministic prediction and compatibility with probabilistic extensions. Its design effectively mitigates covariate shift and enhances predictive reliability across heterogeneous datasets. Experimental evaluations demonstrate that AdaMamba's combination of adaptive normalization and expert-augmented contextual modeling yields consistent improvements in stability and accuracy over conventional Transformer-based baselines.
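
The detrend-then-reintegrate idea in this abstract can be illustrated with a simplified stand-in. The abstract describes learned multi-scale convolutions and channel-wise recalibration; the sketch below instead assumes fixed moving averages at a few window sizes, and all names (`moving_average`, `multi_scale_trend`, `detrend`, `retrend`) and the window choices are hypothetical.

```python
def moving_average(series, window):
    """Centered moving average, clamped at the edges so the output keeps the input length."""
    n = len(series)
    half = window // 2
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append(sum(series[lo:hi]) / (hi - lo))
    return out


def multi_scale_trend(series, windows=(3, 7)):
    """Average the trends obtained at several window sizes (assumed fixed, not learned)."""
    trends = [moving_average(series, w) for w in windows]
    return [sum(vals) / len(vals) for vals in zip(*trends)]


def detrend(series, windows=(3, 7)):
    """Split a series into a residual (what a model would see) and its trend."""
    trend = multi_scale_trend(series, windows)
    residual = [x - t for x, t in zip(series, trend)]
    return residual, trend


def retrend(residual, trend):
    """Denormalization step: add the trend back to a residual-space output."""
    return [r + t for r, t in zip(residual, trend)]
```

In this toy version `retrend(*detrend(x))` simply reproduces `x`; in a forecasting setting the trend would have to be extrapolated over the horizon, which the sketch does not attempt.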

artificial intelligence, forecasting, machine learning, (18 more...)

2512.06929

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

xLSTMAD: A Powerful xLSTM-based Method for Anomaly Detection

Faber, Kamil, Pietroń, Marcin, Żurek, Dominik, Corizzo, Roberto

arXiv.org Artificial Intelligence, Nov-14-2025

The recently proposed xLSTM is a powerful model that leverages expressive multiplicative gating and residual connections, providing the temporal capacity needed for long-horizon forecasting and representation learning. This architecture has demonstrated success in time series forecasting, lossless compression, and even large-scale language modeling tasks, where its linear memory footprint and fast inference make it a viable alternative to Transformers. Despite its growing popularity, no prior work has explored xLSTM for anomaly detection. In this work, we fill this gap by proposing xLSTMAD, the first anomaly detection method that integrates a full encoder-decoder xLSTM architecture, purpose-built for multivariate time series data. Our encoder processes input sequences to capture historical context, while the decoder is devised in two separate variants of the method. In the forecasting approach (xLSTMAD-F), the decoder iteratively generates forecasted future values, while in the reconstruction approach (xLSTMAD-R), it reconstructs the input time series from its encoded counterpart. We investigate the performance of two loss functions, Mean Squared Error (MSE) and Soft Dynamic Time Warping (SoftDTW), to consider local reconstruction fidelity and global sequence alignment, respectively. We evaluate our method on the comprehensive TSB-AD-M benchmark, which spans 17 real-world datasets, using state-of-the-art challenging metrics such as VUS-PR. In our results, xLSTM showcases state-of-the-art accuracy, outperforming 23 popular anomaly detection baselines. Our paper is the first work revealing the powerful modeling capabilities of xLSTM for anomaly detection, paving the way for exciting new developments on this subject. Our code is available at: https://github.com/Nyderx/xlstmad
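
To make the "global sequence alignment" side of the two losses concrete, here is a minimal univariate sketch of the soft-DTW recursion. It is not the xLSTMAD implementation: it assumes a squared-difference local cost, handles only scalar series, and the names `softmin` and `soft_dtw` and the default `gamma` are illustrative choices.

```python
import math


def softmin(values, gamma):
    """Smooth minimum: -gamma * log(sum(exp(-v / gamma))), shifted by the min for stability."""
    m = min(values)
    return m - gamma * math.log(sum(math.exp(-(v - m) / gamma) for v in values))


def soft_dtw(x, y, gamma=1.0):
    """Soft-DTW value between two scalar sequences using a squared-difference cost."""
    n, m = len(x), len(y)
    inf = float("inf")
    # r[i][j] holds the soft alignment cost of x[:i] against y[:j].
    r = [[inf] * (m + 1) for _ in range(n + 1)]
    r[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            r[i][j] = cost + softmin((r[i - 1][j], r[i][j - 1], r[i - 1][j - 1]), gamma)
    return r[n][m]
```

Unlike pointwise MSE, this value stays small when one sequence is a locally stretched copy of the other, and the sequences need not have equal length. As `gamma` approaches zero the soft minimum approaches the hard minimum of classic DTW.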

artificial intelligence, data mining, machine learning, (16 more...)

2506.22837

Country: North America > United States (0.46)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.68)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Conditional Neural ODE for Longitudinal Parkinson's Disease Progression Forecasting

Wang, Xiaoda, Zhao, Yuji, Han, Kaiqiao, Luo, Xiao, van Rooij, Sanne, Stevens, Jennifer, He, Lifang, Zhan, Liang, Sun, Yizhou, Wang, Wei, Yang, Carl

arXiv.org Artificial Intelligence, Nov-10-2025

Parkinson's disease (PD) shows heterogeneous, evolving brain-morphometry patterns. Modeling these longitudinal trajectories enables mechanistic insight, treatment development, and individualized 'digital-twin' forecasting. However, existing methods usually adopt recurrent neural networks and transformer architectures, which rely on discrete, regularly sampled data while struggling to handle irregular and sparse magnetic resonance imaging (MRI) in PD cohorts. Moreover, these methods have difficulty capturing individual heterogeneity including variations in disease onset, progression rate, and symptom severity, which is a hallmark of PD. To address these challenges, we propose CNODE (Conditional Neural ODE), a novel framework for continuous, individualized PD progression forecasting. The core of CNODE is to model morphological brain changes as continuous temporal processes using a neural ODE model. In addition, we jointly learn patient-specific initial time and progress speed to align individual trajectories into a shared progression trajectory. We validate CNODE on the Parkinson's Progression Markers Initiative (PPMI) dataset. Experimental results show that our method outperforms state-of-the-art baselines in forecasting longitudinal PD progression.
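
The idea of aligning individual trajectories onto a shared continuous process can be sketched with a toy example. The abstract does not give the exact form of the alignment, so the sketch assumes an affine time warp (shared time = speed × (t − initial time)), replaces the neural network with a hand-written derivative function, and uses fixed-step Euler integration. All names here are hypothetical.

```python
def euler_integrate(f, z0, tau_end, steps=1000):
    """Integrate dz/dtau = f(z) from tau = 0 to tau_end with fixed-step Euler."""
    z = z0
    h = tau_end / steps
    for _ in range(steps):
        z = z + h * f(z)
    return z


def shared_time(t, t0, speed):
    """Assumed affine map from a patient's clock to the shared progression clock."""
    return speed * (t - t0)


def predict(f, z0, t, t0, speed):
    """State at patient time t, read off the shared trajectory at the warped time."""
    tau = max(0.0, shared_time(t, t0, speed))
    return euler_integrate(f, z0, tau)
```

Under this assumption, a patient progressing twice as fast reaches the same state in half the elapsed time, e.g. `predict(f, 1.0, t=2.0, t0=0.0, speed=2.0)` matches `predict(f, 1.0, t=4.0, t0=0.0, speed=1.0)`.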

artificial intelligence, machine learning, natural language, (18 more...)

2511.04789

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Parkinson's Disease (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Distributionally Robust Feature Selection

Swaroop, Maitreyi, Krishnamurti, Tamar, Wilder, Bryan

arXiv.org Artificial Intelligence, Oct-27-2025

We study the problem of selecting limited features to observe such that models trained on them can perform well simultaneously across multiple subpopulations. This problem has applications in settings where collecting each feature is costly, e.g. requiring adding survey questions or physical sensors, and we must be able to use the selected features to create high-quality downstream models for different populations. Our method frames the problem as a continuous relaxation of traditional variable selection using a noising mechanism, without requiring backpropagation through model training processes. By optimizing over the variance of a Bayes-optimal predictor, we develop a model-agnostic framework that balances overall performance of downstream prediction across populations. We validate our approach through experiments on both synthetic datasets and real-world data.

artificial intelligence, machine learning, selection, (16 more...)

2510.21113

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.68)
Health & Medicine > Therapeutic Area (0.46)
Health & Medicine > Health Care Providers & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

One Whisper to Grade Them All

Phan, Nhan, Porwal, Anusha, Getman, Yaroslav, Voskoboinik, Ekaterina, Grósz, Tamás, Kurimo, Mikko

arXiv.org Artificial Intelligence, Oct-7-2025

We present an efficient end-to-end approach for holistic Automatic Speaking Assessment (ASA) of multi-part second-language tests, developed for the 2025 Speak & Improve Challenge. Our system's main novelty is the ability to process all four spoken responses with a single Whisper-small encoder, combine all information via a lightweight aggregator, and predict the final score. This architecture removes the need for transcription and per-part models, cuts inference time, and makes ASA practical for large-scale Computer-Assisted Language Learning systems. Our system achieved a Root Mean Squared Error (RMSE) of 0.384, outperforming the text-based baseline (0.44) while using at most 168M parameters (about 70% of Whisper-small). Furthermore, we propose a data sampling strategy, allowing the model to train on only 44.8% of the speakers in the corpus and still reach 0.383 RMSE, demonstrating improved performance on imbalanced classes and strong data efficiency.

artificial intelligence, machine learning, natural language, (19 more...)

doi: 10.21437/SLaTE.2025-12

2507.17918

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (1.00)

Industry: Education > Curriculum > Subject-Specific Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)

FHRFormer: A Self-supervised Transformer Approach for Fetal Heart Rate Inpainting and Forecasting

Engan, Kjersti, Kanwal, Neel, Yeconia, Anita, Blacy, Ladislaus, Munyaw, Yuda, Mduma, Estomih, Ersdal, Hege

arXiv.org Artificial Intelligence, Sep-26-2025

Approximately 10% of newborns require assistance to initiate breathing at birth, and around 5% need ventilation support. Fetal heart rate (FHR) monitoring plays a crucial role in assessing fetal well-being during prenatal care, enabling the detection of abnormal patterns and supporting timely obstetric interventions to mitigate fetal risks during labor. Applying artificial intelligence (AI) methods to analyze large datasets of continuous FHR monitoring episodes with diverse outcomes may offer novel insights into predicting the risk of needing breathing assistance or interventions. Recent advances in wearable FHR monitors have enabled continuous fetal monitoring without compromising maternal mobility. However, sensor displacement during maternal movement, as well as changes in fetal or maternal position, often lead to signal dropouts, resulting in gaps in the recorded FHR data. Such missing data limits the extraction of meaningful insights and complicates automated (AI-based) analysis. Traditional approaches to handle missing data, such as simple interpolation techniques, often fail to preserve the spectral characteristics of the signals. In this paper, we propose a masked transformer-based autoencoder approach to reconstruct missing FHR signals by capturing both spatial and frequency components of the data. The proposed method demonstrates robustness across varying durations of missing data and can be used for signal inpainting and forecasting. The proposed approach can be applied retrospectively to research datasets to support the development of AI-based risk algorithms. In the future, the proposed method could be integrated into wearable FHR monitoring devices to achieve earlier and more robust risk detection.

data mining, fhrformer, machine learning, (19 more...)

2509.20852

Country: Europe > Norway (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Magnitude Matters: a Superior Class of Similarity Metrics for Holistic Semantic Understanding

Parupudi, V. S. Raghu

arXiv.org Artificial Intelligence, Sep-25-2025

Vector comparison in high dimensions is a fundamental task in NLP, yet it is dominated by two baselines: the raw dot product, which is unbounded and sensitive to vector norms, and the cosine similarity, which discards magnitude information entirely. This paper challenges both standards by proposing and rigorously evaluating a new class of parameter-free, magnitude-aware similarity metrics. I introduce two such functions, Overlap Similarity (OS) and Hyperbolic Tangent Similarity (HTS), designed to integrate vector magnitude and alignment in a more principled manner. To ensure that my findings are robust and generalizable, I conducted a comprehensive evaluation using four state-of-the-art sentence embedding models (all-MiniLM-L6-v2, all-mpnet-base-v2, paraphrase-mpnet-base-v2, and BAAI/bge-large-en-v1.5) across a diverse suite of eight standard NLP benchmarks, including STS-B, SICK, Quora, and PAWS. Using the Wilcoxon signed-rank test for statistical significance, my results are definitive: on the tasks requiring holistic semantic understanding (paraphrase and inference), both OS and HTS provide a statistically significant improvement in Mean Squared Error over both the raw dot product and cosine similarity, regardless of the underlying embedding model. Crucially, my findings delineate the specific domain of advantage for these metrics: for tasks requiring holistic semantic understanding like paraphrase and inference, my magnitude-aware metrics offer a statistically superior alternative. This significant improvement was not observed on benchmarks designed to test highly nuanced compositional semantics (SICK, STS-B), identifying the challenge of representing compositional text as a distinct and important direction for future work.
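
The contrast between the two baselines named in this abstract can be shown in a few lines. The sketch covers only the raw dot product and cosine similarity; the abstract does not define OS or HTS, so they are not reproduced here. The helper names `dot` and `cosine` are illustrative.

```python
import math


def dot(u, v):
    """Raw dot product: unbounded and grows with the vector norms."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity: dot product of the unit vectors, so magnitude is discarded."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))
```

Scaling one vector by a factor of ten multiplies the dot product by ten but leaves the cosine unchanged, which is the trade-off the abstract describes.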

artificial intelligence, machine learning, natural language, (16 more...)

2509.19323

Country: North America > United States > Minnesota (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Knowledge-Guided Adaptive Mixture of Experts for Precipitation Prediction

Jiang, Chen, Osei, Kofi, Yeddula, Sai Deepthi, Feng, Dongji, Ku, Wei-Shinn

arXiv.org Artificial Intelligence, Sep-16-2025

Accurate precipitation forecasting is indispensable in agriculture, disaster management, and sustainable strategies. However, predicting rainfall has been challenging due to the complexity of climate systems and the heterogeneous nature of multi-source observational data, including radar, satellite imagery, and surface-level measurements. The multi-source data vary in spatial and temporal resolution, and they carry domain-specific features, which makes them difficult to integrate effectively in conventional deep learning models. Previous research has explored various machine learning techniques for weather prediction; however, most struggle with the integration of data with heterogeneous modalities. To address these limitations, we propose an Adaptive Mixture of Experts (MoE) model tailored for precipitation rate prediction. Each expert within the model specializes in a specific modality or spatio-temporal pattern. We also incorporated a dynamic router that learns to assign inputs to the most relevant experts. Our results show that this modular design enhances predictive accuracy and interpretability. In addition to the modeling framework, we introduced an interactive web-based visualization tool that enables users to intuitively explore historical weather patterns over time and space. The tool was designed to support decision-making for stakeholders in climate-sensitive sectors. We evaluated our approach using a curated multimodal climate dataset capturing real-world conditions during Hurricane Ian in 2022. The benchmark results show that the Adaptive MoE significantly outperformed all the baselines.
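
The router-plus-experts pattern described here can be sketched generically. This is not the paper's model: it assumes a linear router followed by a softmax (dense gating over all experts), uses plain Python callables as experts, and the names `softmax`, `moe_predict`, and `router_weights` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax that turns router scores into gate weights."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_predict(x, experts, router_weights):
    """Gate-weighted sum of expert outputs; each router row scores one expert linearly in x."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    gates = softmax(scores)
    outputs = [expert(x) for expert in experts]
    return sum(g * o for g, o in zip(gates, outputs)), gates
```

Returning the gate weights alongside the prediction is one simple route to the interpretability the abstract mentions, since they show which expert dominated for a given input.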

artificial intelligence, machine learning, prediction, (17 more...)

2509.11459

Country: North America > United States > Florida (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Computational Fluid Dynamics Optimization of F1 Front Wing using Physics Informed Neural Networks

Shah, Naval

arXiv.org Artificial Intelligence, Sep-3-2025

In response to recent FIA regulations reducing Formula 1 team wind tunnel hours (from 320 hours for last-place teams to 200 hours for championship leaders) and strict budget caps of 135 million USD per year, more efficient aerodynamic development tools are needed by teams. Conventional computational fluid dynamics (CFD) simulations, though offering high fidelity results, require large computational resources with typical simulation durations of 8-24 hours per configuration analysis. This article proposes a Physics-Informed Neural Network (PINN) for the fast prediction of Formula 1 front wing aerodynamic coefficients. The suggested methodology combines CFD simulation data from SimScale with first principles of fluid dynamics through a hybrid loss function that constrains both data fidelity and physical adherence based on Navier-Stokes equations. Training on force and moment data from 12 aerodynamic features, the PINN model records coefficient of determination (R-squared) values of 0.968 for drag coefficient and 0.981 for lift coefficient prediction while lowering computational time. The physics-informed framework guarantees that predictions remain adherent to fundamental aerodynamic principles, offering F1 teams an efficient tool for the fast exploration of design space within regulatory constraints.
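
Two quantities from this abstract lend themselves to a short sketch: the coefficient of determination used to report accuracy, and a hybrid loss that adds a physics term to a data term. The R-squared formula is the standard one; the hybrid loss is only an assumed form (data MSE plus a weighted mean squared physics residual), since the abstract does not state how the terms are combined. The names `r_squared`, `hybrid_loss`, and `weight` are hypothetical.

```python
from statistics import fmean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def hybrid_loss(y_true, y_pred, physics_residuals, weight=1.0):
    """Assumed hybrid loss: data MSE plus a weighted mean squared physics residual."""
    data_term = fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))
    physics_term = fmean(r ** 2 for r in physics_residuals)
    return data_term + weight * physics_term
```

Here `physics_residuals` stands for how far the predictions are from satisfying the governing equations at sampled points; computing those residuals from the Navier-Stokes equations is outside the scope of this sketch.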

artificial intelligence, coefficient, machine learning, (15 more...)

2509.01963

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Sports > Motorsports > Formula One (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.63)